We give a systematic expansion of the crypticity, a recently introduced measure of the inaccessibility of a stationary process's internal state information. This leads to a hierarchy of k-cryptic processes and allows us to identify finite-state processes that have infinite crypticity: the internal state information is present across arbitrarily long, observed sequences. The crypticity expansion is exact in both the finite- and infinite-order cases. It turns out that k-crypticity is complementary to the Markovian finite-order property that describes state information in processes. One application of these results is an efficient expansion of the excess entropy (the mutual information between a process's infinite past and infinite future) that is finite and exact for finite-order cryptic processes.

INTRODUCTION

The data of phenomena come to us through observa- 
tion. A large fraction of the theoretical activity of model 
building, though, focuses on internal mechanism. How 
are observation and modeling related? A first step is to 
frame the problem in terms of hidden processes — internal 
mechanisms probed via instruments that, in particular, 
need not accurately report a process's internal state. A 
practical second step is to measure the difference between 
internal structure and the information in observations. 

We recently established that the amount of observed 
information a process communicates from the past to the 
future — the excess entropy — is the mutual information 
between its forward- and reverse-time minimal causal 
representations [1, 2]. This closed-form expression gives
a concrete connection between the observed information 
and a process's internal structure. 

Excess entropy, and related mutual information quan- 
tities, are widely used diagnostics for complex systems. 
They have been applied to detect the presence of organization in dynamical systems [3, 4, 5, 6], in spin systems [7, 8, 9], in neurobiological systems [10, 11], and
even in language [12, 13], to mention only a very few uses.
Thus, understanding how much internal state structure 
is reflected in the excess entropy is critical to whether or 
not these and other studies of complex systems can draw 
structural inferences about the internal mechanisms that 
produce observed behavior. 

Unfortunately, there is a fundamental problem. The excess entropy is not the internal state information the process stores; rather, the latter is the process's statistical complexity [1, 2]. On the positive side, there is a diagnostic. The difference between, if you will, experiment and theory (between observed information and internal structure) is controlled by the difference between a process's excess entropy and its statistical complexity. This difference is called the crypticity: how much internal state information is inaccessible [1, 2]. Here we introduce a classification of processes using a systematic expansion of crypticity.

The starting point is computational mechanics' minimal causal representation of a stochastic process $\mathcal{P}$: the ε-machine [14, 15]. There, a process is viewed as a channel that communicates information from the past, $\overleftarrow{X} = \ldots X_{-3} X_{-2} X_{-1}$, to the future, $\overrightarrow{X} = X_0 X_1 X_2 \ldots$. ($X_t$ takes values in a finite measurement alphabet $\mathcal{A}$.) The excess entropy is the shared (or mutual) information between the past and the future: $\mathbf{E} = I[\overleftarrow{X}; \overrightarrow{X}]$. The amount of historical information that a process stores in the present is different. It is given by the Shannon information $C_\mu = H[\mathcal{S}]$ of the distribution over the ε-machine's causal states $\mathcal{S}$. $C_\mu$ is called the statistical complexity and the causal states are sets of pasts $\overleftarrow{x}$ that are equivalent for prediction [14]:

$\epsilon(\overleftarrow{x}) = \{ \overleftarrow{x}' : \Pr(\overrightarrow{X} | \overleftarrow{x}) = \Pr(\overrightarrow{X} | \overleftarrow{x}') \}$ . (1)

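Causal states have a Markovian property that they render the past and future statistically independent; they shield the future from the past [15]:

$\Pr(\overleftarrow{X}, \overrightarrow{X} | \mathcal{S}) = \Pr(\overleftarrow{X} | \mathcal{S}) \Pr(\overrightarrow{X} | \mathcal{S})$ . (2)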

e-Machines are also unifilar [l^[l6|: From the start state, 
each observed sequence . . . X-3X-2X-1 . . . corresponds to 
one and only one sequence of causal states. The signature 
of unifilarity is that on knowing the current state and 
measurement, the uncertainty in the next state vanishes: 
$H[\mathcal{S}_{t+1} | \mathcal{S}_t, X_t] = 0$.
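The unifilarity condition can be checked directly on a set of symbol-labeled transition matrices. The following is a minimal sketch and not part of the original development: the helper names (`stationary`, `is_unifilar`, `next_state_entropy`) and both example machines are illustrative assumptions. In particular, the Even Process matrices assume the labeling A→A on 0, A→B on 1, and B→A on 1, with p = 1/2.

```python
import numpy as np

def stationary(T):
    """Stationary state distribution of the state-to-state matrix sum_x T[x]."""
    P = sum(T.values())
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1))])
    return pi / pi.sum()

def is_unifilar(T):
    """True if each (state, symbol) pair has at most one successor state."""
    return all((np.count_nonzero(M, axis=1) <= 1).all() for M in T.values())

def next_state_entropy(T):
    """H[S_{t+1} | S_t, X_t] in bits under the stationary distribution."""
    pi = stationary(T)
    h = 0.0
    for M in T.values():
        for i, row in enumerate(M):
            total = row.sum()              # Pr(X = x | S = i)
            if total > 0:
                q = row[row > 0] / total   # Pr(S' | S = i, X = x)
                h -= pi[i] * total * np.sum(q * np.log2(q))
    return h

# T[x][i, j] = Pr(S' = j, X = x | S = i), states ordered (A, B).
# Assumed Even Process labeling at p = 1/2.
even = {0: np.array([[0.5, 0.0], [0.0, 0.0]]),
        1: np.array([[0.0, 0.5], [1.0, 0.0]])}
# Hypothetical non-unifilar machine: from A, symbol 0 can lead to A or B.
nonunifilar = {0: np.array([[0.25, 0.25], [0.0, 0.0]]),
               1: np.array([[0.0, 0.5], [1.0, 0.0]])}

print(is_unifilar(even), next_state_entropy(even))
print(is_unifilar(nonunifilar), next_state_entropy(nonunifilar))
```

The structural test and the entropy test agree by construction: the next-state entropy vanishes exactly when every row of every labeled matrix has at most one nonzero entry on the states that carry stationary probability.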

Although they are not the same, the basic relationship between these quantities is clear: $\mathbf{E}$ is the process's effective channel utilization and $C_\mu$ is the sophistication of that channel. Their difference, one of our main concerns in the following, indicates how a process stores, manipulates, and hides internal state information.

Until recently, $\mathbf{E}$ could not be as directly calculated from the ε-machine as the process's entropy rate $h_\mu$ and its statistical complexity. Ref. [1] and Ref. [2] solved this problem, giving a closed-form expression for the excess entropy:

$\mathbf{E} = I[\mathcal{S}^+; \mathcal{S}^-]$ , (3)

where $\mathcal{S}^+$ are the causal states of the process scanned in the "forward" direction and $\mathcal{S}^-$ are the causal states of the process scanned in the "reverse" time direction.

This result comes in a historical context. Some 
time ago, an explicit expression for the excess entropy 
had been developed from the Hamiltonian for one- 
dimensional spin chains with range- i? interactions Q: 

$\mathbf{E} = C_\mu - R h_\mu$ . (4)

A similar, but slightly less compact form is known for order-R Markov processes:

$\mathbf{E} = H[X_0^R] - R h_\mu$ , (5)

where $X_0^R = X_0, \ldots, X_{R-1}$. It has also been known for some time that the statistical complexity is an upper bound on the excess entropy [16]:

$\mathbf{E} \le C_\mu$ ,

which follows from the equality derived there:

$\mathbf{E} = C_\mu - H[\mathcal{S}^+ | \overrightarrow{X}]$ .

Using forward and reverse e-machines, Ref. f]\ ex- 
tended this, deriving the closed-form expression for E in 
Eq. ^ and two new bounds on E: E < and E < C^. 
It also showed that: 

$H[\mathcal{S}^+ | \overrightarrow{X}] = H[\mathcal{S}^+ | \mathcal{S}^-]$ (6)

and identified this quantity as controlling how a process hides its internal state information. For this reason, it is called the process's crypticity:

$\chi^+ = H[\mathcal{S}^+ | \overrightarrow{X}]$ . (7)

In the context of forward and reverse ε-machines, one must distinguish two crypticities; depending on the scan direction one has:

$\chi^+ = H[\mathcal{S}^+ | \mathcal{S}^-]$ or
$\chi^- = H[\mathcal{S}^- | \mathcal{S}^+]$ .

In the following we will not concern ourselves with reverse representations and so can simplify the notation, using $C_\mu$ for $C_\mu^+$ and $\chi$ for $\chi^+$.

Here we show that, for a restricted class of processes, the crypticity in Eq. (6) can be systematically expanded to give an alternative closed-form to the excess entropy in Eq. (3). One ancillary benefit is a new and, we argue, natural hierarchy of processes in terms of information accessibility.



K-CRYPTICITY 

The process classifications based on spin-block length and order-R Markov are useful. They give some insight into the nature of the kinds of process we can encounter and, concretely, they allow for closed-form expressions for the excess entropy (and other system properties). In a similar vein, we wish to carve the space of processes with a new blade. We define the class of k-cryptic processes and develop their properties and closed-form expressions for their excess entropies.

For convenience, we need to introduce several shorthands. First, to denote a symbol sequence that begins at time t and is L symbols long, we write $X_t^L$. Note that $X_t^L$ includes $X_{t+L-1}$, but not $X_{t+L}$. Second, to denote a symbol sequence that begins at time t and continues on to infinity, we write $\overrightarrow{X}_t$.

Definition. The k-crypticity criterion is satisfied when

$H[\mathcal{S}_k | \overrightarrow{X}_0] = 0$ . (8)

Definition. A k-cryptic process is one for which the process's ε-machine satisfies the k-crypticity criterion.

Definition. An ∞-cryptic process is one for which the process's ε-machine does not satisfy the k-crypticity criterion for any finite k.

Lemma 1. $H[\mathcal{S}_k | \overrightarrow{X}_0]$ is a nonincreasing function of k.

Proof. This follows directly from stationarity and the fact that conditioning on more random variables cannot increase entropy:

$H[\mathcal{S}_{k+1} | \overrightarrow{X}_0] = H[\mathcal{S}_k | \overrightarrow{X}_{-1}] \le H[\mathcal{S}_k | \overrightarrow{X}_0]$ .

□

Lemma 2. If $\mathcal{P}$ is k-cryptic, then $\mathcal{P}$ is also j-cryptic for all j > k.

Proof. Being k-cryptic implies $H[\mathcal{S}_k | \overrightarrow{X}_0] = 0$. Applying Lem. 1, $H[\mathcal{S}_j | \overrightarrow{X}_0] \le H[\mathcal{S}_k | \overrightarrow{X}_0] = 0$. By positivity of entropy, we conclude that $\mathcal{P}$ is also j-cryptic. □

This provides us with a new way of partitioning the space of processes. We create a parametrized class of sets $\{\chi_k : k = 0, 1, 2, \ldots\}$, where $\chi_k = \{\mathcal{P} : \mathcal{P}$ is k-cryptic and not (k - 1)-cryptic$\}$.

The following result provides a connection to a very 
familiar class of processes. 

Proposition 1. If a process $\mathcal{P}$ is order-k Markov, then it is k-cryptic.

Proof. If $\mathcal{P}$ is order-k Markov, then $H[\mathcal{S}_k | X_0^k] = 0$. Conditioning on more variables does not increase uncertainty, so:

$H[\mathcal{S}_k | X_0^k, \overrightarrow{X}_k] = 0$ .

But the lefthand side is $H[\mathcal{S}_k | \overrightarrow{X}_0]$. Therefore, $\mathcal{P}$ is k-cryptic. □

Note that the converse of Prop. 1 is not true. For example, the Even Process (EP), the Random Noisy Copy Process (RnC), and the Random Insertion Process (RIP) (see Ref. [1] and Ref. [2]), are all 1-cryptic, but are not order-R Markov for any finite R.

Note also that Prop. 1 does not preclude an order-k Markov process from being j-cryptic, where j < k. Later we will show an example demonstrating this.

Given a process, in general one will not know its crypticity order. One way to investigate this is to study the sequence of estimates of $\chi$ at different orders. To this end, we define the k-cryptic approximation.

Definition. The k-cryptic approximation is defined as

$\chi(k) = H[\mathcal{S}_0 | X_0^k, \mathcal{S}_k]$ .

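The k-Cryptic Expansion

We will now develop a systematic expansion of $\chi$ to order k in which $\chi(k)$ appears directly and the k-crypticity criterion plays the role of an error term.

Theorem 1. The process crypticity is given by

$\chi = \chi(k) + H[\mathcal{S}_k | \overrightarrow{X}_0]$ . (9)

Proof. We calculate directly, starting from the definition, adding and subtracting the k-crypticity criterion term from $\chi$'s definition, Eq. (7):

$\chi = H[\mathcal{S}_0 | \overrightarrow{X}_0] - H[\mathcal{S}_k | \overrightarrow{X}_0] + H[\mathcal{S}_k | \overrightarrow{X}_0]$ .

We claim that the first two terms are $\chi(k)$. Expanding the conditionals in the purported $\chi(k)$ terms and then canceling, we get joint distributions:

$H[\mathcal{S}_0 | \overrightarrow{X}_0] - H[\mathcal{S}_k | \overrightarrow{X}_0] = H[\mathcal{S}_0, \overrightarrow{X}_0] - H[\mathcal{S}_k, \overrightarrow{X}_0]$ .

Now, splitting the future into two pieces and using this to write conditionals, the righthand side becomes:

$H[\overrightarrow{X}_k | \mathcal{S}_0, X_0^k] + H[\mathcal{S}_0, X_0^k] - H[\overrightarrow{X}_k | \mathcal{S}_k, X_0^k] - H[\mathcal{S}_k, X_0^k]$ .

Appealing to the ε-machine's unifilarity, we then have:

$H[\overrightarrow{X}_k | \mathcal{S}_k] + H[\mathcal{S}_0, X_0^k] - H[\overrightarrow{X}_k | \mathcal{S}_k, X_0^k] - H[\mathcal{S}_k, X_0^k]$ .

Now, applying causal shielding gives:

$H[\overrightarrow{X}_k | \mathcal{S}_k] + H[\mathcal{S}_0, X_0^k] - H[\overrightarrow{X}_k | \mathcal{S}_k] - H[\mathcal{S}_k, X_0^k]$ .

Canceling terms, this simplifies to:

$H[\mathcal{S}_0, X_0^k] - H[\mathcal{S}_k, X_0^k]$ .

We now re-expand, using unifilarity to give:

$H[\mathcal{S}_0, X_0^k, \mathcal{S}_k] - H[\mathcal{S}_k, X_0^k]$ .

Finally, we combine these, using the definition of conditional entropy, to simplify again:

$H[\mathcal{S}_0 | X_0^k, \mathcal{S}_k]$ .

Note that this is our definition of $\chi(k)$. This establishes our original claim:

$\chi = \chi(k) + H[\mathcal{S}_k | \overrightarrow{X}_0]$ ,

with the k-crypticity criterion playing the role of an approximation error.

□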
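The approximation $\chi(k) = H[\mathcal{S}_0 | X_0^k, \mathcal{S}_k]$ involves only finite-length words, so it can be evaluated by direct enumeration. The following is a minimal sketch under stated assumptions, not the procedure of Refs. [1, 2]: the function names are our own, the cost grows as $|\mathcal{A}|^k$, and the two example machines assume the labelings A→A on 1, A→B on 0, B→A on 1 (Golden Mean) and A→A on 0, A→B on 1, B→A on 1 (Even), both at p = 1/2.

```python
import itertools
import numpy as np

def stationary(T):
    """Stationary state distribution of the state-to-state matrix sum_x T[x]."""
    P = sum(T.values())
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1))])
    return pi / pi.sum()

def entropy(probs):
    """Shannon entropy in bits, ignoring zero-probability events."""
    p = np.asarray([x for x in probs if x > 0])
    return float(-(p * np.log2(p)).sum())

def chi_k(T, k):
    """chi(k) = H[S_0, X_0^k, S_k] - H[X_0^k, S_k] by enumerating length-k words."""
    pi = stationary(T)
    n = len(pi)
    joint, marginal = [], []
    for w in itertools.product(sorted(T), repeat=k):
        Tw = np.eye(n)
        for x in w:
            Tw = Tw @ T[x]
        J = pi[:, None] * Tw            # J[s0, sk] = Pr(S_0 = s0, X_0^k = w, S_k = sk)
        joint.extend(J.ravel())
        marginal.extend(J.sum(axis=0))  # Pr(X_0^k = w, S_k = sk)
    return entropy(joint) - entropy(marginal)

# T[x][i, j] = Pr(S' = j, X = x | S = i), states ordered (A, B); assumed labelings.
golden = {0: np.array([[0.0, 0.5], [0.0, 0.0]]),
          1: np.array([[0.5, 0.0], [1.0, 0.0]])}
even = {0: np.array([[0.5, 0.0], [0.0, 0.0]]),
        1: np.array([[0.0, 0.5], [1.0, 0.0]])}

for k in range(4):
    print(k, chi_k(golden, k), chi_k(even, k))
```

For a k-cryptic process the sequence stops changing once it reaches $\chi$, so a plateau in the printed values is consistent with, though not a proof of, finite cryptic order.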

Corollary 1. A process $\mathcal{P}$ is k-cryptic if and only if $\chi = \chi(k)$.

Proof. Given the order-k expansion of $\chi$ just developed, we now assume the k-crypticity criterion is satisfied; viz., $H[\mathcal{S}_k | \overrightarrow{X}_0] = 0$. Thus, we have from Eq. (9)

$\chi = \chi(k)$ .

Likewise, assuming $\chi = \chi(k)$ requires, by Eq. (9), that $H[\mathcal{S}_k | \overrightarrow{X}_0] = 0$ and thus the process is k-cryptic. □

Corollary 2. For any process, $\chi(0) = 0$.

Proof.

$\chi(0) = H[\mathcal{S}_0 | X_0^0, \mathcal{S}_0] = H[\mathcal{S}_0 | \mathcal{S}_0] = 0$ .

□

Convergence

Proposition 2. The approximation $\chi(k)$ is a nondecreasing function of k.

Proof. Lem. 1 showed that $H[\mathcal{S}_k | \overrightarrow{X}_0]$ is a nonincreasing function of k. By Thm. 1, $\chi(k)$ must be a nondecreasing function of k. □

Corollary 3. Once $\chi(k)$ reaches the value $\chi$, $\chi(j) = \chi$ for all j > k.

Proof. If there exists such a k, then by Thm. 1 the process is k-cryptic. By Lem. 2, the process is j-cryptic for all j > k. Again, by Thm. 1, $\chi(j) = \chi$. □

Corollary 4. If there is a k ≥ 1 for which $\chi(k) = 0$, then $\chi(1) = 0$.

Proof. By positivity of the conditional entropy $H[\mathcal{S}_0 | X_0, \mathcal{S}_1]$, $\chi(1) \ge 0$. By the nondecreasing property of $\chi(k)$ from Prop. 2, $\chi(1) \le \chi(k) = 0$. Therefore, $\chi(1) = 0$. □

Corollary 5. If $\chi(1) = 0$, then $\chi(k) = 0$ for all k.

Proof. Applying stationarity, $\chi(1) = H[\mathcal{S}_0 | X_0, \mathcal{S}_1] = H[\mathcal{S}_k | X_k, \mathcal{S}_{k+1}]$. We are given $\chi(1) = 0$ and so $H[\mathcal{S}_k | X_k, \mathcal{S}_{k+1}] = 0$. We use this below. Expanding $\chi(k+1)$,

$\chi(k+1) = H[\mathcal{S}_0 | X_0^{k+1}, \mathcal{S}_{k+1}]$
$= H[\mathcal{S}_0 | X_0^k, X_k, \mathcal{S}_{k+1}]$
$= H[\mathcal{S}_0 | X_0^k, \mathcal{S}_k, X_k, \mathcal{S}_{k+1}]$
$\le H[\mathcal{S}_0 | X_0^k, \mathcal{S}_k]$
$= \chi(k)$ .

The third line follows from $\chi(1) = 0$. By Prop. 2, $\chi(k+1) \ge \chi(k)$. Therefore, $\chi(k+1) = \chi(k)$. Finally, using $\chi(1) = 0$, we have by induction that $\chi(k) = 0$ for all k. □

Corollary 6. If there is a k ≥ 1 for which $\chi(k) = 0$, then $\chi(j) = 0$ for all j ≥ 1.

Proof. This follows by composing Cor. 4 with Cor. 5. □

Together, the proposition and its corollaries show that $\chi(k)$ is a nondecreasing function of k which, if it reaches $\chi$ at a finite k, remains at that value for all larger k.

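Proposition 3. The cryptic approximation $\chi(k)$ converges to $\chi$ as $k \to \infty$.

Proof. Note that $\chi = \lim_{k \to \infty} H[\mathcal{S}_0 | X_0^k]$ and recall that $\chi(k) = H[\mathcal{S}_0 | X_0^k, \mathcal{S}_k]$. We show that the difference approaches zero:

$H[\mathcal{S}_0 | X_0^k] - H[\mathcal{S}_0 | X_0^k, \mathcal{S}_k]$
$= H[\mathcal{S}_0, X_0^k] - H[X_0^k] - H[\mathcal{S}_0, X_0^k, \mathcal{S}_k] + H[X_0^k, \mathcal{S}_k]$
$= H[X_0^k, \mathcal{S}_k] - H[X_0^k]$
$= H[\mathcal{S}_k | X_0^k]$ .

Moreover, $\lim_{k \to \infty} H[\mathcal{S}_k | X_0^k] = 0$ by the ε map from pasts to causal states of Eq. (1). Therefore, as $k \to \infty$, $\chi(k) \to \chi$. □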



Excess Entropy for k-Cryptic Processes

Given a k-cryptic process, we can calculate its excess entropy in a form that involves a sum of $\propto |\mathcal{A}^k|$ terms, where each term involves products of k matrices. Specifically, we have the following.

Corollary 7. A process $\mathcal{P}$ is k-cryptic if and only if $\mathbf{E} = C_\mu - \chi(k)$.

Proof. From Ref. [1], we have $\mathbf{E} = C_\mu - \chi$, and by Cor. 1, $\chi = \chi(k)$. Together, these complete the proof. □

The following corollary is a simple and useful consequence of the class of k-cryptic processes.

Corollary 8. A process $\mathcal{P}$ is 0-cryptic if and only if $\mathbf{E} = C_\mu$.

Proof. If $\mathcal{P}$ is 0-cryptic, our general expression then reads

$\mathbf{E} = C_\mu - H[\mathcal{S}_0 | X_0^0, \mathcal{S}_0] = C_\mu$ .

To establish the opposite direction, $\mathbf{E} = C_\mu$ and $\mathbf{E} = C_\mu - \chi$ give $\chi = 0$. Since $0 \le \chi(k) \le \chi$ by Thm. 1, this implies that $\chi(k) = 0$ for all k. In particular, $\chi = \chi(0)$ and, by Cor. 1, the process is 0-cryptic. □
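Corollary 7 can be compared numerically against the order-R Markov expression of Eq. (5) on a process for which both apply. The following is a sketch under stated assumptions rather than a reference implementation: the function names are ours, and the Golden Mean matrices assume the labeling A→A on 1, A→B on 0, B→A on 1 with p = 1/2, for which the process is order-1 Markov and 1-cryptic.

```python
import itertools
import numpy as np

def stationary(T):
    """Stationary state distribution of the state-to-state matrix sum_x T[x]."""
    P = sum(T.values())
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1))])
    return pi / pi.sum()

def entropy(probs):
    """Shannon entropy in bits, ignoring zero-probability events."""
    p = np.asarray([x for x in probs if x > 0])
    return float(-(p * np.log2(p)).sum())

def word_matrices(T, L):
    """Yield T^(w) for every word w of length L."""
    n = next(iter(T.values())).shape[0]
    for w in itertools.product(sorted(T), repeat=L):
        Tw = np.eye(n)
        for x in w:
            Tw = Tw @ T[x]
        yield Tw

def chi_k(T, k):
    """chi(k) = H[S_0, X_0^k, S_k] - H[X_0^k, S_k]."""
    pi = stationary(T)
    joint, marginal = [], []
    for Tw in word_matrices(T, k):
        J = pi[:, None] * Tw
        joint.extend(J.ravel())
        marginal.extend(J.sum(axis=0))
    return entropy(joint) - entropy(marginal)

def excess_entropy_cryptic(T, k):
    """E = C_mu - chi(k); exact only if the process is k-cryptic (Cor. 7)."""
    return entropy(stationary(T)) - chi_k(T, k)

def excess_entropy_markov(T, R):
    """E = H[X_0^R] - R h_mu; exact only if the process is order-R Markov (Eq. 5)."""
    pi = stationary(T)
    block = [float(pi @ Tw.sum(axis=1)) for Tw in word_matrices(T, R)]
    # For a unifilar presentation, h_mu = H[X_0 | S_0].
    h_mu = sum(pi[i] * entropy([T[x][i].sum() for x in T]) for i in range(len(pi)))
    return entropy(block) - R * h_mu

# Assumed Golden Mean labeling at p = 1/2, states ordered (A, B).
golden = {0: np.array([[0.0, 0.5], [0.0, 0.0]]),
          1: np.array([[0.5, 0.0], [1.0, 0.0]])}

print(excess_entropy_cryptic(golden, 1), excess_entropy_markov(golden, 1))
```

Both routes should agree with the value $\log_2(3) - 4/3$ bits quoted for the Golden Mean Process at p = 1/2; using k = 0 in the cryptic expression instead returns $C_\mu$, which overestimates $\mathbf{E}$ because the process is not 0-cryptic.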



Crypticity versus Markovity 

Equations (4) and (5) give expressions for $\mathbf{E}$ in the cases when the process is an order-R spin chain and when it is order-R Markov, respectively. These results hinge on whether or not $H[X_0^R] = C_\mu$.

Reference stated a condition under which equality 
holds in terms of transfer matrices. Here we state a sim- 
pler condition by equating two chain rule expansions of 
$H[X_0^R, \mathcal{S}_R]$:

$H[X_0^R | \mathcal{S}_R] + H[\mathcal{S}_R] = H[\mathcal{S}_R | X_0^R] + H[X_0^R]$ .

$H[\mathcal{S}_R | X_0^R] = 0$ by virtue of the fact that each such (history) word maps to exactly one causal state by Eq. (1). Thus, we conclude that for order-R Markov processes:



$H[X_0^R] - C_\mu = H[X_0^R | \mathcal{S}_R]$ .

So, an order-R Markov process is also a spin chain if and only if $H[X_0^R | \mathcal{S}_R] = 0$. This means that there is a 1-1 correspondence between the R-blocks and causal states,
confirming the interpretation specified in Ref. 

We can also extend the condition for $H[X_0^R] = C_\mu$ to
the results presented here in the following way. 

Proposition 4.

$H[X_0^R | \mathcal{S}_R] = 0 \iff \chi(R) = R h_\mu$ , (10)

where $h_\mu$ is the process's entropy rate.

Proof. The proof is a direct calculation:

$\chi(R) = H[\mathcal{S}_0 | X_0^R, \mathcal{S}_R]$
$= H[\mathcal{S}_0, X_0^R] - H[X_0^R, \mathcal{S}_R]$
$= H[X_0^R | \mathcal{S}_0] + H[\mathcal{S}_0] - H[X_0^R | \mathcal{S}_R] - H[\mathcal{S}_R]$
$= R h_\mu - H[X_0^R | \mathcal{S}_R]$ .

□

Proposition 5. Periodic processes can be arbitrary order-R Markov, but are all 0-cryptic.

Proof. According to Ref. [14], we have $\mathbf{E} = C_\mu$. By Cor. 8, the process is 0-cryptic. □

Proposition 6. A positive entropy-rate process that is an order-R Markov spin chain is not (R - 1)-cryptic.

Proof. Assume that the order-R Markov spin chain is (R - 1)-cryptic.

For R ≥ 1, if the process is (R - 1)-cryptic, then by Cor. 1, $\chi(R-1) = \chi$. Combining this with the calculation in the proof of Prop. 4, we have $\chi(R-1) = (R-1) h_\mu - H[X_0^{R-1} | \mathcal{S}_{R-1}]$. If it is an order-R Markov spin chain, then we also have from Eq. (4) that $\chi = R h_\mu$. Combining this with the previous equation, we find that $H[X_0^{R-1} | \mathcal{S}_{R-1}] = -h_\mu$. By positivity of conditional entropies, we have reached a contradiction. Therefore an order-R Markov spin chain must not be (R - 1)-cryptic.

For R = 0, the proof also holds since negative cryptic orders are not defined. □

Proposition 7. A positive entropy-rate process that is an order-R Markov spin chain is not (R - n)-cryptic for any 1 ≤ n ≤ R.

Proof. For R ≥ 1, by Lem. 2, if the process were (R - n)-cryptic for some 1 ≤ n ≤ R, then it would be (R - 1)-cryptic. By Prop. 6, this is not true. Therefore, the primitive orders of Markovity and crypticity are the same. Similarly, for R = 0, the proof also holds since negative cryptic orders are not defined. □



EXAMPLES

It is helpful to see crypticity in action. We now turn to a number of examples to illustrate how various orders of crypticity manifest themselves in ε-machine structure and what kinds of processes are cryptic and so hide internal state information from an observer. For details (transition matrices, notation, and the like) not included in the following and for complementary discussions and analyses of them, see Refs. [l|,l3,[i3.

We start at the bottom of the crypticity hierarchy with a 0-cryptic process and then show examples of 1-cryptic and 2-cryptic processes. Continuing up the hierarchy, we generalize and give a parametrized family of processes that are k-cryptic. Finally, we demonstrate an example that is ∞-cryptic.

Even Process: 0-Cryptic

Figure 1 gives the ε-machine for the Even Process. The Even Process produces binary sequences in which all blocks of uninterrupted 1s are even in length, bounded by 0s. Further, after each even length is reached, there is a probability p of breaking the block of 1s by inserting one or more 0s.

FIG. 1: A 0-cryptic process: Even Process. The transitions denote the probability p of generating symbol x as p|x.

Reference [1] showed that the Even Process is 0-cryptic with a statistical complexity of $C_\mu = H(1/(2-p))$, an entropy rate of $h_\mu = H(p)/(2-p)$, and crypticity of $\chi = 0$. If p = 1/2, then $C_\mu = \log_2(3) - 2/3$ bits and $\mathbf{E} = \log_2(3) - 2/3$ bits. (As Ref. [2] notes, these closed-form expressions for $C_\mu$ and $\mathbf{E}$ have been known for some time.)

To see why the Even Process is 0-cryptic, note that if $X_0 = 0$, then $\mathcal{S}_0 = A$; and if $X_0 = 1$, then $\mathcal{S}_0 = B$. Therefore, the 0-crypticity criterion of Eq. (8) is satisfied.

It is important to note that this process is not order-R Markov for any finite R [17]. Nonetheless, our new expression for $\mathbf{E}$ is valid. This shows the broadening of our ability to calculate $\mathbf{E}$ even for low complexity processes that are, in effect, infinite-order Markov.

Golden Mean Process: 1-Cryptic

Figure 2 shows the ε-machine for the Golden Mean Process [17]. The Golden Mean Process is one in which no two 0s occur consecutively. After each 1, there is a probability p of generating a 0. As sequence length grows, the ratio of the number of allowed words of length L to the number of allowed words at length L - 1 approaches the golden ratio; hence, its name. The Golden Mean Process ε-machine looks remarkably similar to that for the Even Process. The informational analysis, however, shows that they have markedly different properties.

FIG. 2: A 1-cryptic process: Golden Mean Process.



Reference [2] showed that the Golden Mean Process has the same statistical complexity and entropy rate as the Even Process: $C_\mu = H(1/(2-p))$ and $h_\mu = H(p)/(2-p)$. However, the crypticity is not zero (for 0 < p < 1). From Cor. 1 we calculate:

$\chi = \chi(1)$
$= H[\mathcal{S}_0 | X_0^1, \mathcal{S}_1]$
$= H[\mathcal{S}_0 | X_0^1]$
$= \Pr(0) H[\mathcal{S}_0 | X_0 = 0] + \Pr(1) H[\mathcal{S}_0 | X_0 = 1]$
$= H(p)/(2-p)$ .

If p = 1/2, we have $C_\mu = \log_2(3) - 2/3$ bits, an excess entropy of $\mathbf{E} = \log_2(3) - 4/3$ bits, and a crypticity of $\chi = 2/3$ bits. Thus, the excess entropy differs from that of the Even Process. (As with the Even Process, these closed-form expressions for $C_\mu$ and $\mathbf{E}$ have been known for some time.)

The Golden Mean Process is 1-cryptic. To see why, it is enough to note that it is order-1 Markov. By Prop. 1, it is 1-cryptic. We know it is not 0-cryptic since any future beginning with 1 could have originated in either state A or B. In addition, the spin-block expression for excess entropy of Ref. [17], Eq. (4) here, applies for an R = 1 Markov chain.



Butterfly Process: 2-Cryptic 

The next example, the Butterfly Process of Fig. 3, illustrates in a more explicit way than possible with the previous processes the role that crypticity plays and how it can be understood in terms of an ε-machine's structure. Much of the explanation does not require calculating much, if anything.

FIG. 3: A 2-cryptic process: Butterfly Process over a 6-symbol alphabet.

It is first instructive to see why the Butterfly Process is not 1-cryptic.

If we can find a family $\{\overrightarrow{x}_0\}$ such that $H[\mathcal{S}_1 | \overrightarrow{X}_0 = \overrightarrow{x}_0] \ne 0$, then the total conditional entropy will be positive and, thus, the machine will not be 1-cryptic. To show that this can happen, consider the future $\overrightarrow{x}_0 = (0, 1, 2, 4, 4, 4, \ldots)$. It is clear that the state following 1 must be A. Thus, in order to generate 0 or 1 before arriving at A, the state pair $(\mathcal{S}_0, \mathcal{S}_1)$ can be either (B, C) or (D, E). This uncertainty in $\mathcal{S}_1$ is enough to break the criterion. And this occurs for the family $\{\overrightarrow{x}_0\} = \{0, 1, \ldots\}$.

To see that the process is 2-cryptic, notice that the two paths (B, C) and (D, E) converge on A. Therefore, there is no uncertainty in $\mathcal{S}_2$ given this future. It is reasonably straightforward to see that indeed any $(X_0, X_1)$ will lead to a unique causal state. This is because the Butterfly Process is a very limited version of an 8-symbol order-2 Markov process.

Note that the transition matrix is doubly-stochastic and so the stationary distribution is uniform. The statistical complexity is rather direct in this case: $C_\mu = \log_2(5)$. We now can calculate $\chi$ using Cor. 1:

$\chi = \chi(2)$
$= H[\mathcal{S}_0 | X_0^2, \mathcal{S}_2]$
$= H[\mathcal{S}_0 | X_0^2]$
$= \Pr(01) H[\mathcal{S}_0 | X_0^2 = 01] + \Pr(12) H[\mathcal{S}_0 | X_0^2 = 12] + \Pr(13) H[\mathcal{S}_0 | X_0^2 = 13]$
$= 3/10$ bits.

From Cor. 7 we get an excess entropy of

$\mathbf{E} = C_\mu - \chi(2) = \log_2(5) - 3/10 \approx 2.0219$ bits.

For comparison, if we had assumed the Butterfly Process was 1-cryptic, then we would have:

$\mathbf{E} = C_\mu - \chi(1)$
$= C_\mu - (H[\mathcal{S}_0, X_0] - H[\mathcal{S}_1, X_0])$
$\approx \log_2(5) - (3.3219 - 2.5062)$
$= \log_2(5) - 0.8156 \approx 1.5063$ bits.

We can see that this is substantially below the true value: a 25% error.



Restricted Golden Mean: k-Cryptic

Now we turn to illustrate a crypticity-parametrized family of processes, giving examples of k-cryptic processes for any k. We call this family the Restricted Golden Mean as its support is a restriction of the Golden Mean support. (See Fig. 4 for its ε-machines.) The k = 1 member of the family is exactly the Golden Mean.

It is straightforward to see that this process is order-k Markov. Proposition 1 then implies it is (at most) k-cryptic. In order to show that it is not (k - 1)-cryptic, consider the case $\overrightarrow{x}_0 = 1^k, 0, \ldots$. The first (k - 1) 1s will induce a mixture over states k and 0. The following future $\overrightarrow{x}_{k-1} = 1, 0, \ldots$ is consistent with both states k and 0. Therefore, the (k - 1)-crypticity criterion is not satisfied. Therefore, it is k-cryptic.




FIG. 4: k-cryptic processes: Restricted Golden Mean Family.

For arbitrary k, there are k + 1 causal states and the stationary distribution is:

$\pi = \left( \frac{2}{k+2}, \frac{1}{k+2}, \frac{1}{k+2}, \ldots \right)$ .

The statistical complexity is

$C_\mu = \log_2(k+2) - \frac{2}{k+2}$ .

For the k-th member of the family, we have for the crypticity:

$\chi = \chi(k) = \frac{2k}{k+2}$ .

And the excess entropy follows directly from Cor. 7:

$\mathbf{E} = C_\mu - \chi = \log_2(k+2) - \frac{2(k+1)}{k+2}$ ,

which diverges with k. (Calculational details will be provided elsewhere.)



Stretched Golden Mean 

The Stretched Golden Mean is a family of processes that does not occupy the same support as the Golden Mean. Instead of requiring that blocks of 0s are of length 1, we require that they are of length k. Here, the Markov order (k) grows, but the cryptic order remains 1 for all k.

Again, it is straightforward to see that this process is order-k Markov. To see that it is 1-cryptic, first note that if $X_0 = 1$, then $\mathcal{S}_1 = 0$. Next consider the case when $X_0 = 0$. If the future $\overrightarrow{x}_1 = 1, \ldots$, then $\mathcal{S}_1 = k$. Similarly, if the future $\overrightarrow{x}_1 = 0^n, 1, \ldots$, then $\mathcal{S}_1 = k - n$. This family exhibits arbitrary separation between its Markov order and its cryptic order and so demonstrates that these properties are not redundant.




FIG. 5: k-cryptic processes: Stretched Golden Mean Family.

The stationary distribution is the same as for the Re- 
stricted Golden Mean and so, then, is the statistical com- 
plexity. In addition, we have: 



Consequently, 



$\chi = \chi(1) = H[\mathcal{S}_0 | X_0, \mathcal{S}_1]$



The Nemo Process: ∞-Cryptic

We close our cryptic process bestiary with a (very) finite-state process that has infinite crypticity: the three-state Nemo Process. Over no finite-length sequence will all of the internal state information be present in the observations. The Nemo Process ε-machine is shown in Fig. 6.

Its stationary state distribution is

$\Pr(\mathcal{S}) \equiv \pi = \frac{1}{3-2p} (1, 1-p, 1-p)$ over the states (A, B, C),

from which one calculates the statistical complexity:

$C_\mu = \log_2(3-2p) - \frac{2(1-p)}{3-2p} \log_2(1-p)$ .

FIG. 6: The ∞-cryptic Nemo Process.



The Nemo Process is not a finite-cryptic process. That is, there exists no finite k for which $H[\mathcal{S}_k | \overrightarrow{X}_0] = 0$. To show this, we must demonstrate that there exists a family of futures such that for each future $H[\mathcal{S}_k | \overrightarrow{X}_0 = \overrightarrow{x}_0] > 0$. The family of futures we use begins with all 0s and then has a 1. Intuitively, the 1 is chosen because it is a synchronizing word for the process: after observing a 1, the ε-machine is always in state A. Then, causal shielding will decouple the infinite future from the first few symbols, thereby allowing us to compute the conditional entropies for the entire family of futures.

First, recall the shorthand:

$\Pr(\mathcal{S}_k | \overrightarrow{X}_0) = \lim_{L \to \infty} \Pr(\mathcal{S}_k | X_0^L)$ .

Without loss of generality, assume k < L. Then,

$\Pr(\mathcal{S}_k | X_0^L) = \frac{\Pr(X_0^L, \mathcal{S}_k)}{\Pr(X_0^L)} = \frac{\Pr(X_k^{L-k} | \mathcal{S}_k) \Pr(X_0^k, \mathcal{S}_k)}{\Pr(X_0^L)}$ ,

where the last step is possible since the causal states are 
Markovian [l^, shielding the past from the future. Each 
of these quantities is given by: 

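$\Pr(X_k^{L-k} = w | \mathcal{S}_k = \sigma) = [T^{(w)} \mathbf{1}]_\sigma$ ,
$\Pr(X_0^k = w, \mathcal{S}_k = \sigma) = [\pi T^{(w)}]_\sigma$ ,
$\Pr(X_0^L = w) = \pi T^{(w)} \mathbf{1}$ ,

where $T^{(w)} \equiv T^{(x_0)} T^{(x_1)} \cdots T^{(x_{L-1})}$ for $w = x_0 x_1 \ldots x_{L-1}$, $\mathbf{1}$ is a column vector of 1s, and $T^{(x)}_{\sigma \sigma'} = \Pr(\mathcal{S}' = \sigma', X = x | \mathcal{S} = \sigma)$. To establish $H[\mathcal{S}_k | \overrightarrow{X}_0] > 0$ for any k, we rely on using values of k that are multiples of three. So, we concentrate on the following for n = 0, 1, 2, ...:

$H[\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1}] > 0$ .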

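Since 1 is a synchronizing word, we can greatly simplify the conditional probability distribution. First, we freely include the synchronized causal state A and rewrite the conditional distribution as a fraction:

$\Pr(\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1})$
$= \Pr(\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A, \overrightarrow{X}_{3n+1})$
$= \frac{\Pr(\mathcal{S}_{3n}, X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A, \overrightarrow{X}_{3n+1})}{\Pr(X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A, \overrightarrow{X}_{3n+1})}$ .

Then, we factor everything except $\overrightarrow{X}_{3n+1}$ out of the numerator and make use of causal shielding to simplify the conditional. For example, the numerator becomes:

$\Pr(\mathcal{S}_{3n}, X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A, \overrightarrow{X}_{3n+1})$
$= \Pr(\overrightarrow{X}_{3n+1} | \mathcal{S}_{3n}, X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A) \times \Pr(\mathcal{S}_{3n}, X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A)$
$= \Pr(\overrightarrow{X}_{3n+1} | \mathcal{S}_{3n+1} = A) \times \Pr(\mathcal{S}_{3n}, X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A)$
$= \Pr(\overrightarrow{X}_{3n+1} | \mathcal{S}_{3n+1} = A) \Pr(\mathcal{S}_{3n}, X_0^{3n+1} = 0^{3n}1)$ .

Similarly, the denominator becomes:

$\Pr(X_0^{3n+1} = 0^{3n}1, \mathcal{S}_{3n+1} = A, \overrightarrow{X}_{3n+1})$
$= \Pr(\overrightarrow{X}_{3n+1} | \mathcal{S}_{3n+1} = A) \Pr(X_0^{3n+1} = 0^{3n}1)$ .

Combining these results, we obtain a finite form for the entropy of $\mathcal{S}_{3n}$ conditioned on a family of infinite futures, first noting:

$\Pr(\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1}) = \Pr(\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1)$ .

Thus, for all $\overrightarrow{x}_{3n+1}$, we have:

$H[\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1} = \overrightarrow{x}_{3n+1}] = H[\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1]$ .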

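Now, we are ready to compute the conditional entropy for the entire family. First, note that $T^{(0)}$ raised to the third power is a diagonal matrix with each element equal to $(1-p)(1-q)$. Thus, for j = 1, 2, 3, ...:

$[T^{(0)}]^{3j} = (1-p)^j (1-q)^j I$ .

Using all of the above relations, we can easily calculate:

$\Pr(\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1) = \frac{1}{p+q-pq} (p, 0, q(1-p))$ over the states (A, B, C).

Thus, for $p, q \in (0, 1)$, we have:

$H[\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1}]$
$= \sum_{\overrightarrow{x}_{3n+1}} \Pr(X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1} = \overrightarrow{x}_{3n+1}) \, H[\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1} = \overrightarrow{x}_{3n+1}]$
$= H[\mathcal{S}_{3n} | X_0^{3n+1} = 0^{3n}1] \sum_{\overrightarrow{x}_{3n+1}} \Pr(X_0^{3n+1} = 0^{3n}1, \overrightarrow{X}_{3n+1} = \overrightarrow{x}_{3n+1})$
$= \frac{(1-p)^n (1-q)^n}{3-2p} \left[ p \log_2 \frac{p+q-pq}{p} + q(1-p) \log_2 \frac{p+q-pq}{q(1-p)} \right]$
$> 0$ .

So, any time k is a multiple of three, $H[\mathcal{S}_k | \overrightarrow{X}_0] > 0$. Finally, suppose (k mod 3) = i, where i ≠ 0. That is, suppose k is not a multiple of three. By Lem. 1, $H[\mathcal{S}_k | \overrightarrow{X}_0] \ge H[\mathcal{S}_{k+3-i} | \overrightarrow{X}_0]$ and, since k + 3 - i is a multiple of three, we just showed that the latter quantity is strictly greater than zero. We conclude that $H[\mathcal{S}_k | \overrightarrow{X}_0] > 0$ for every value of k.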

The above establishes that the Nemo Process does not satisfy the k-crypticity criterion for any finite k. Thus, the Nemo Process is ∞-cryptic. This means that we cannot make use of the k-cryptic approximation to calculate $\chi$ or $\mathbf{E}$.
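The key steps of this argument can be checked numerically. The sketch below is illustrative only: since the figure is not reproduced here, the transition structure is an assumption chosen to be consistent with the stated facts (a 1 always leads to state A, the stationary distribution is proportional to (1, 1-p, 1-p), and the cube of $T^{(0)}$ is diagonal). Specifically we assume A→A on 1 with probability p, A→B on 0 with probability 1-p, B→C on 0 with probability 1, C→A on 1 with probability q, and C→A on 0 with probability 1-q. The function names and the parameter values p = 0.3, q = 0.6 are arbitrary choices of ours.

```python
import numpy as np

def nemo(p, q):
    """Assumed Nemo Process labeled matrices, states ordered (A, B, C)."""
    T0 = np.array([[0, 1 - p, 0], [0, 0, 1], [1 - q, 0, 0]], dtype=float)
    T1 = np.array([[p, 0, 0], [0, 0, 0], [q, 0, 0]], dtype=float)
    return T0, T1

def state_posterior(p, q, n):
    """Pr(S_{3n} | X_0^{3n+1} = 0^{3n} 1) under the assumed matrices."""
    T0, T1 = nemo(p, q)
    pi = np.array([1, 1 - p, 1 - p]) / (3 - 2 * p)
    reach = pi @ np.linalg.matrix_power(T0, 3 * n)  # Pr(S_{3n}, X_0^{3n} = 0^{3n})
    joint = reach * T1.sum(axis=1)                  # then emit a 1 from S_{3n}
    return joint / joint.sum()

def entropy(dist):
    """Shannon entropy in bits, ignoring zero-probability events."""
    d = dist[dist > 0]
    return float(-(d * np.log2(d)).sum())

p, q = 0.3, 0.6
T0, T1 = nemo(p, q)
pi = np.array([1, 1 - p, 1 - p]) / (3 - 2 * p)
print(np.allclose(pi @ (T0 + T1), pi))   # pi is stationary
print(np.linalg.matrix_power(T0, 3))     # diagonal, entries (1-p)(1-q)
for n in range(4):
    post = state_posterior(p, q, n)
    print(n, post, entropy(post))        # same distribution for every n
```

Under these assumptions the posterior over $\mathcal{S}_{3n}$ does not sharpen as n grows, which is the numerical counterpart of the statement that the state uncertainty never vanishes at any finite k.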

Fortunately, the techniques introduced in Ref. [1] and Ref. [2] do not rely on an approximation method. To avoid ambiguity denote the statistical complexity we just computed as $C_\mu^+$. When the techniques are applied to the Nemo Process, we find that the process is causally reversible ($C_\mu^+ = C_\mu^-$) and has the following forward-reverse causal-state conditional distribution:



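$\Pr(\mathcal{S}^+ | \mathcal{S}^-) = \frac{1}{p+q-pq} \begin{pmatrix} p & 0 & q(1-p) \\ q & p(1-q) & 0 \\ 0 & q & p(1-q) \end{pmatrix}$ ,

where rows are indexed by the reverse causal states (D, E, F) and columns by the forward causal states (A, B, C).

With this, one can calculate $\mathbf{E}$, in closed-form, via:

$\mathbf{E} = C_\mu^+ - H[\mathcal{S}^+ | \mathcal{S}^-]$ .

(Again, calculational details will be provided elsewhere.)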



CONCLUSION 

Calculating the excess entropy $I[\overleftarrow{X}; \overrightarrow{X}]$ is, at first blush, a daunting task. We are asking for a mutual information between two infinite sets of random variables. Appealing to $\mathbf{E} = I[\mathcal{S}; \overrightarrow{X}]$, we use the compact representation of the ε-machine to reduce one infinite set (the past) to a (usually) finite set. A process's k-crypticity captures something similar about the infinite set of future variables and allows us to further compact our form for excess entropy, reducing an infinite variable set to a finite one. The resulting stratification of process space is a novel way of thinking about its structure and, as long as we know which stratum we lie in, we can rapidly calculate many quantities of interest.

Unfortunately, in the general case, one will not know a 
priori a process's crypticity order. Worse, as far as we are 
aware, there is no known finite method for calculating the 
crypticity order. This strikes us as an interesting open 
problem and challenge. 

If, by construction or by some other means, one does know it, then, as we showed, crypticity and $\mathbf{E}$ can be calculated using the crypticity expansion. Failing this, though, one might consider using the expansion to search for the order. There is no known stopping criterion, so this search may not find k in finite time. Moreover, the expansion is a calculation that grows exponentially in computational complexity with crypticity order, as we noted. Devising a stopping criterion would be very useful to such a search.

Even without knowing the k-crypticity, the expansion is often still useful. For use in estimating $\mathbf{E}$, it provides us with a bound from above. This is complementary to the bound below one finds using the typical expansion $\mathbf{E}(L) = H[X_0^L] - h_\mu L$ [17]. Using these upper and lower bounds, one may determine that for a given purpose, the estimate of $\chi$ or $\mathbf{E}$ is within an acceptable tolerance.
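The bracketing just described can be sketched numerically. The code below is an illustration under stated assumptions, not a prescribed estimation procedure: the function names are ours, words are enumerated exhaustively (so the cost is exponential in L and k), and the Even Process matrices assume the labeling A→A on 0, A→B on 1, B→A on 1 with p = 1/2.

```python
import itertools
import numpy as np

def stationary(T):
    """Stationary state distribution of the state-to-state matrix sum_x T[x]."""
    P = sum(T.values())
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1))])
    return pi / pi.sum()

def entropy(probs):
    """Shannon entropy in bits, ignoring zero-probability events."""
    p = np.asarray([x for x in probs if x > 0])
    return float(-(p * np.log2(p)).sum())

def word_matrices(T, L):
    """Yield T^(w) for every word w of length L."""
    n = next(iter(T.values())).shape[0]
    for w in itertools.product(sorted(T), repeat=L):
        Tw = np.eye(n)
        for x in w:
            Tw = Tw @ T[x]
        yield Tw

def upper_bound(T, k):
    """C_mu - chi(k), an upper bound on E that is tight for k-cryptic processes."""
    pi = stationary(T)
    joint, marginal = [], []
    for Tw in word_matrices(T, k):
        J = pi[:, None] * Tw
        joint.extend(J.ravel())
        marginal.extend(J.sum(axis=0))
    return entropy(pi) - (entropy(joint) - entropy(marginal))

def lower_bound(T, L):
    """E(L) = H[X_0^L] - h_mu L, with h_mu = H[X_0 | S_0] for a unifilar machine."""
    pi = stationary(T)
    block = [float(pi @ Tw.sum(axis=1)) for Tw in word_matrices(T, L)]
    h_mu = sum(pi[i] * entropy([T[x][i].sum() for x in T]) for i in range(len(pi)))
    return entropy(block) - L * h_mu

# Assumed Even Process labeling at p = 1/2, states ordered (A, B).
even = {0: np.array([[0.5, 0.0], [0.0, 0.0]]),
        1: np.array([[0.0, 0.5], [1.0, 0.0]])}

for L in range(1, 9):
    print(L, lower_bound(even, L), upper_bound(even, 0))
```

For this example the upper bound is already exact at k = 0, while the block-entropy estimate approaches it only gradually from below, so the gap between the two printed columns is a direct measure of how far the lower estimate still has to go.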

The crypticity hierarchy is a revealing way to carve the space of processes in that it concerns how they hide internal state information from an observer. The examples were chosen to illustrate several features of this new view. The Even Process, a canonical example of order-∞ Markov, resides instead at the very bottom of this ladder. The two example families show us how k-cryptic is neither a parallel nor independent concept to order-R Markov. Finally, we see in the last example an apparently simple process with ∞-crypticity.

The general lesson is that internal state information need not be immediately available in measurement values, but instead may be spread over long measurement sequences. If a process is k-cryptic and k is finite, then internal state information is accessible over sequences of length k. The existence, as we demonstrated, of processes that are ∞-cryptic is rather sobering. (The Appendix
comments on what happens when one fails to appreciate 
this.) Interpreted as a statement of the impossibility of 
extracting state information, it reminds us of earlier work 
on hidden spatial dynamical systems that exhibit a simi- 
lar encrypting of internal structure in observed spacetime 
patterns T^. 

Due to the exponentially growing computational effort to search for the crypticity order and, concretely, the existence of ∞-cryptic processes, the general theory introduced in Ref. [1] and Ref. [2] is seen to be necessary. It allows one to directly calculate $\mathbf{E}$ and crypticity and to do so efficiently.

Acknowledgments 

Chris Ellison was partially supported on a GAANN 
fellowship. The Network Dynamics Program funded by 
Intel Corporation also partially supported this work. 

APPENDIX: CRYPTICITY UNTAMED 

Recently, Ref. [ll| asserted that a process's E can be 
obtained from its e-machine using the following expres- 
sion: 

$\mathbf{E} = C_\mu - I_{erased}$ ,

where $I_{erased} = H[\mathcal{S}_0, X_0] - H[\mathcal{S}_1, X_0]$. Though renamed, $I_{erased}$ is the crypticity of Ref. [1]. However, as we showed in the main development, it is $\chi(1)$ and so the above expression is valid only for 0-cryptic and 1-cryptic processes.

Ref. [l^ considered only the Even and Golden Mean 
Processes. These, as we saw, are 0-cryptic and 1-cryptic 
and so it is no surprise that the expression worked. In- 
deed, their low-order crypticity is why closed-form ex- 
pressions for their excess entropies have been known for 
quite some time, prior to the recent developments. 

In short, the claims in Ref. [l^ are incorrect. The im- 
plication there that all e-machines are 1-cryptic is also. 
The examples we gave show how wrong such an approxi- 
mation can be. We showed how large the errors can grow. 
The full theory of Ref. [1] and Ref. [2] is required. The
richness of the space of processes leads us to conjecture
that it will suffer no shortcuts.